FULVIO CASTELLO, LAVINIA COMERRO, MARCOS RICARDO LAWRIE

## DLX PRO

#### DEVELOPMENT PROGRESSION

- Calendar: 3 people working for 5 months, including lab sessions and theoretical lectures
- Approach: entirely bottom-up, with implementation and V&V practices carried out in parallel
- Main focus: creating a cohesive, modular design that external people could easily work on or even resume
- Outcome(s): a "pro" version of the assigned project, properly tested and documented, along with a subsequently improved version.



#### TESTING PROCEDURE

- Purpose: using the provided remote server to verify the behavioral correctness of each component and of the interaction between components
- Lower level: simple testbenches built to check 100% of the functionality implemented within each basic block; simulation has to be run "by hand"
- Composite structures: CLI automation scripts that report the results to the user with a "black-box" approach
- Methodology: heavy use of VHDL assertions that keep track of individual module coverage, with the intent of making test suites self-checking.
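The self-checking idea can be illustrated outside of VHDL. The following is a minimal Python sketch, not the project's actual testbench code: `SelfCheckingSuite` and `alu_model` are hypothetical names introduced here, and the four-operation ALU is only a stand-in for a block under test. Each check records which operation it exercised, so the suite can report both failures and coverage without manual inspection.

```python
class SelfCheckingSuite:
    """Collects pass/fail results and tracks which operations were exercised."""

    def __init__(self, required_ops):
        self.required = set(required_ops)
        self.covered = set()
        self.failures = []

    def check(self, op, actual, expected):
        """Record one comparison; a mismatch is stored instead of raising."""
        self.covered.add(op)
        if actual != expected:
            self.failures.append((op, actual, expected))

    def coverage(self):
        """Fraction of required operations hit by at least one check."""
        if not self.required:
            return 1.0
        return len(self.covered & self.required) / len(self.required)

    def passed(self):
        """True only if nothing failed and every required operation was hit."""
        return not self.failures and self.coverage() == 1.0


def alu_model(op, a, b):
    """Hypothetical 32-bit ALU stand-in, used only to drive the example."""
    mask = 0xFFFFFFFF
    ops = {"add": a + b, "sub": a - b, "and": a & b, "or": a | b}
    return ops[op] & mask


suite = SelfCheckingSuite(["add", "sub", "and", "or"])
suite.check("add", alu_model("add", 1, 2), 3)
suite.check("sub", alu_model("sub", 0, 1), 0xFFFFFFFF)
print(f"coverage: {suite.coverage():.0%}, failures: {len(suite.failures)}")
```

In this sketch a suite that never fails but skips an operation still does not pass, which mirrors the intent of tying assertions to per-module coverage.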

## DATAPATH

- 32-bit data parallelism
- 5-stage pipeline made of corresponding blocks
- 32 general purpose integer registers (with const. r0)
- 4 main condition code flags {N, Z, C, V}
- Complete/flexible support for extended instruction set.
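As a behavioural illustration of two of the features listed above, the Python sketch below models a 32-entry register file with a constant r0 and derives the {N, Z, C, V} flags for a 32-bit addition. It rests on assumptions that the list does not state: r0 is taken to read as zero (the usual DLX convention), and the flags follow the common definitions (carry out of bit 31, signed overflow). The names `RegisterFile` and `add32_flags` are hypothetical.

```python
MASK32 = 0xFFFFFFFF


class RegisterFile:
    """32 general-purpose registers; r0 is assumed to be hardwired to zero."""

    def __init__(self):
        self._regs = [0] * 32

    def read(self, index):
        return self._regs[index]

    def write(self, index, value):
        # Writes to r0 are silently ignored so it stays constant.
        if index != 0:
            self._regs[index] = value & MASK32


def add32_flags(a, b):
    """Add two 32-bit values and return (result, flags) with N, Z, C, V."""
    a &= MASK32
    b &= MASK32
    full = a + b                  # may need 33 bits
    result = full & MASK32
    flags = {
        "N": bool(result >> 31),  # sign bit of the result
        "Z": result == 0,         # result is all zeros
        "C": full > MASK32,       # carry out of bit 31
        # Signed overflow: operands share a sign that the result lacks.
        "V": bool((~(a ^ b) & (a ^ result)) >> 31 & 1),
    }
    return result, flags


rf = RegisterFile()
rf.write(0, 123)            # has no effect under the r0 assumption
rf.write(1, 0x7FFFFFFF)
result, flags = add32_flags(rf.read(1), 1)
print(hex(result), flags)
```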



## REVISION

- Sign extension [alternate] shifted: MEM ▶ WB
- Program Counter selection multiplexer moved: MEM ▶ EXE
- Branch condition registers discarded: EXE, MEM
- Control signals updated accordingly.



### DESIGN IMPROVEMENTS

- Patched: attempting to load reduced-width values from the Data Memory previously caused synchronization faults, thus invalidating results
- Improved: the "don't care" condition associated with the final 4-to-1 multiplexer was removed, further increasing the overall architectural regularity
- Sped-up: branching operations now take 2 fewer clock cycles, reducing the number of void actions the pipeline may take thereafter
- Control Unit: control bits have slightly changed and increased in number, but two of them are internally shared between stages, so the control word complexity is unaltered.
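The reduced-width loads mentioned in the first item can be pictured with a short behavioural model. This Python sketch shows the usual way a byte or halfword is widened to 32 bits, with either sign or zero extension. It is an assumption-laden illustration rather than the project's implementation: the function name `extend_load` is hypothetical, the value is assumed to sit in the low bits of the fetched word, and byte-lane selection by address is not modelled.

```python
MASK32 = 0xFFFFFFFF


def extend_load(word, size, signed):
    """Keep the low `size` bytes of a 32-bit word and extend them to 32 bits.

    size   -- access width in bytes: 1 (byte), 2 (halfword) or 4 (word)
    signed -- True for sign extension, False for zero extension
    """
    if size not in (1, 2, 4):
        raise ValueError("size must be 1, 2 or 4 bytes")
    bits = 8 * size
    low_mask = (1 << bits) - 1
    value = word & low_mask
    # If the top bit of the narrow value is set, fill the upper bits with ones.
    if signed and value >> (bits - 1):
        value |= MASK32 & ~low_mask
    return value


print(hex(extend_load(0x00000080, 1, signed=True)))
print(hex(extend_load(0x00000080, 1, signed=False)))
```

The two calls differ only in the `signed` argument, which is the kind of choice a sign-extension block in the write-back path has to make for each load.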

## FINAL BENCHMARKS

| Metric                 | Previous | Current  | Difference |
|------------------------|----------|----------|------------|
| $A_{cell}$             | 17.94 µm | 17.87 µm | -0.37%     |
| $T_{clock}$            | 3.00 ns  | 3.18 ns  | +6.00%     |
| $P_{internal}$         | 3.92 mW  | 3.71 mW  | -5.56%     |
| $P_{switching}$        | 0.29 mW  | 0.28 mW  | -5.21%     |
| $P_{dynamic}$          | 4.21 mW  | 3.98 mW  | -5.54%     |
| $P_{leakage}$          | 0.35 mW  | 0.35 mW  | -0.45%     |
| $P_{total}$            | 4.57 mW  | 4.33 mW  | -5.14%     |
| Functionality          | Partial  | Complete | n/a        |
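The Difference column is a relative change with respect to the previous value. The Python sketch below recomputes it from the rounded figures in the table; `pct_change` is a hypothetical helper name. Because the table entries are rounded to two decimals, values recomputed this way can differ slightly from the listed percentages, which were presumably derived from unrounded measurements.

```python
def pct_change(previous, current):
    """Relative change from `previous` to `current`, in percent."""
    if previous == 0:
        raise ValueError("previous value must be non-zero")
    return (current - previous) / previous * 100.0


# (previous, current) pairs copied from the table above.
metrics = {
    "T_clock [ns]": (3.00, 3.18),
    "P_total [mW]": (4.57, 4.33),
}

for name, (prev, curr) in metrics.items():
    print(f"{name}: {pct_change(prev, curr):+.2f}%")
```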

# THE MICROELECTRONIC SYSTEMS COURSE @ POLITO (A.Y. 2021/2022).

ms22.14